<html>
  <head>
    <title>BioHacker: The Network Coverage Problem</title>
  </head>
  <body>
    <h1><a href="http://biohacker.googlecode.com">BioHacker</a>: <em>The Network Coverage Problem</em></h1>

<h2>Given</h2>

<h3>Beliefs holding over all organisms</h3>

<p>
We believe in a set of chemical <em>reactions</em>, converting
<em>reactants</em> (inputs) into <em>products</em> (outputs).  These
reactions are organized into <em>pathways</em>, typically catabolisms,
converting nutrients into intermediary compounds, and anabolisms,
converting intermediary compounds into essential compounds. The notion
of <em>converter</em> from a set of inputs to a set of outputs
generalizes over reactions and pathways. A converter is either
primitive (a reaction) or compound (a pathway). A compound converter
is a graph, where each node represents a converter; each directed edge,
an inner input-output dependency; each dangling incoming edge, an input;
and each dangling outgoing edge, an output. With these primitives
(reactions), means of combination (pathways) and means of
abstraction (converters), we organize our set of reactions
hierarchically.
</p>
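<p>
As a minimal sketch of the converter notion above, the following Python
models a primitive converter and a compound converter. The class names
<code>Reaction</code> and <code>Pathway</code>, the compound names and
the rule that inner edges are implied by matching compound names are all
assumptions made for illustration; they are not taken from the BioHacker
code.
</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reaction:
    """Primitive converter: consumes reactants, yields products."""
    name: str
    reactants: frozenset
    products: frozenset

    def inputs(self):
        return self.reactants

    def outputs(self):
        return self.products


@dataclass(frozen=True)
class Pathway:
    """Compound converter: a group of converters wired by shared compounds."""
    name: str
    parts: tuple

    def _flows(self):
        # Everything consumed and everything produced by the inner converters.
        consumed = frozenset().union(*(p.inputs() for p in self.parts))
        produced = frozenset().union(*(p.outputs() for p in self.parts))
        return consumed, produced

    def inputs(self):
        # Dangling incoming edges: consumed inside, produced by no part.
        consumed, produced = self._flows()
        return consumed - produced

    def outputs(self):
        # Dangling outgoing edges: produced inside, consumed by no part.
        consumed, produced = self._flows()
        return produced - consumed


# Hypothetical toy network: A gives B, B gives C, C plus E gives D.
r1 = Reaction("r1", frozenset({"A"}), frozenset({"B"}))
r2 = Reaction("r2", frozenset({"B"}), frozenset({"C"}))
r3 = Reaction("r3", frozenset({"C", "E"}), frozenset({"D"}))
catabolism = Pathway("catabolism", (r1, r2))
network = Pathway("network", (catabolism, r3))
print(sorted(network.inputs()), sorted(network.outputs()))  # ['A', 'E'] ['D']
```

<p>
One simplification to note: a compound that is both produced and
consumed inside a pathway is treated as purely internal here, so it
never appears as an output of that pathway.
</p>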

<h3>Assumptions per organism</h3>

<p>
We assume an organism is capable of <em>catalyzing</em> a subset of
our universal set of reactions: an organism has genes, genes code for
proteins, proteins form enzymes, and enzymes catalyze reactions.
</p>

<h3>Beliefs holding per organism</h3>

<p>
For each organism, we believe in a set of compounds essential for
growth of the organism.  Crucially, we also collect, and believe, a set
of growth / no-growth <em>experiments</em>.  In addition to an
<em>outcome</em> (growth / no-growth), an experiment specifies the
presence/absence of nutrients and genes.
</p>


<h2>Show</h2>

<h3>Underestimations / Completeness</h3>

<p>
If we enable the assumptions of a <em>growth experiment</em>, and we
<em>cannot</em> produce all essential compounds, we have a contradiction: our
assumptions about the organism must contain <em>false negatives</em>,
i.e. some <em>additional</em> converters from the universal set should be assumed.
</p>

<h3>Overestimations / Coherence</h3>

<p>
If we enable the assumptions of a <em>no-growth experiment</em>, and
we <em>can</em> produce all essential compounds, we have a contradiction: our
assumptions about the organism must contain <em>false positives</em>,
i.e. some assumptions must be <em>retracted</em>.
</p>
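<p>
The two checks above can be sketched as a forward-chaining closure
followed by a comparison with the recorded outcome. This is a
hypothetical illustration, not the BioHacker implementation: the
function names, the dictionary layout of an experiment and the choice to
model an absent gene as a knocked-out reaction name are all assumptions.
</p>

```python
def producible(nutrients, reactions):
    """Forward-chain: fire every reaction whose reactants are all available."""
    available = set(nutrients)
    changed = True
    while changed:
        changed = False
        for reactants, products in reactions:
            ready = set(reactants).issubset(available)
            if ready and not set(products).issubset(available):
                available.update(products)
                changed = True
    return available


def diagnose(experiment, assumed, essential):
    """Classify one experiment against the assumed reactions of an organism."""
    enabled = [rxn for name, rxn in assumed.items()
               if name not in experiment["knocked_out"]]
    reachable = producible(experiment["nutrients"], enabled)
    can_grow = set(essential).issubset(reachable)
    if experiment["growth"] and not can_grow:
        return "false negatives"   # underestimation: assume more converters
    if not experiment["growth"] and can_grow:
        return "false positives"   # overestimation: retract some assumptions
    return "ok"


# Hypothetical organism: r1 turns A into B, r2 turns B into C; C is essential.
assumed = {"r1": (("A",), ("B",)), "r2": (("B",), ("C",))}
essential = {"C"}
grew_on_x = {"nutrients": {"X"}, "knocked_out": set(), "growth": True}
died_on_a = {"nutrients": {"A"}, "knocked_out": set(), "growth": False}
died_without_r2 = {"nutrients": {"A"}, "knocked_out": {"r2"}, "growth": False}
print(diagnose(grew_on_x, assumed, essential))        # false negatives
print(diagnose(died_on_a, assumed, essential))        # false positives
print(diagnose(died_without_r2, assumed, essential))  # ok
```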

<h2>Efficiently</h2>

<p>
By organizing the converters hierarchically, we can consider the
higher-level converters first, delving into their internals only when
necessary or requested.
</p>
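<p>
One way to picture this top-down treatment is a function that opens a
converter only to a requested depth. The sketch below is an assumption
made for illustration: the dictionary layout, the name <code>view</code>
and the toy converters are hypothetical.
</p>

```python
def view(converter, depth):
    """List the converters visible when `converter` is opened `depth` levels."""
    parts = converter.get("parts", [])
    if depth == 0 or not parts:
        # Treat the converter as a black box: only its interface is shown.
        return [(converter["name"], converter["inputs"], converter["outputs"])]
    visible = []
    for part in parts:
        visible.extend(view(part, depth - 1))
    return visible


# Hypothetical hierarchy: a pathway nested inside a larger pathway.
r1 = {"name": "r1", "inputs": {"A"}, "outputs": {"B"}}
r2 = {"name": "r2", "inputs": {"B"}, "outputs": {"C"}}
r3 = {"name": "r3", "inputs": {"C"}, "outputs": {"D"}}
catabolism = {"name": "catabolism", "inputs": {"A"}, "outputs": {"C"},
              "parts": [r1, r2]}
metabolism = {"name": "metabolism", "inputs": {"A"}, "outputs": {"D"},
              "parts": [catabolism, r3]}

for depth in range(3):
    print(depth, [name for name, _, _ in view(metabolism, depth)])
# 0 ['metabolism']
# 1 ['catabolism', 'r3']
# 2 ['r1', 'r2', 'r3']
```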

<h2>Globally</h2>

<p>
We should not only be able to detect incompleteness/incoherence, but
also offer fixes.  We want to be able to offer global minimal fixes,
i.e. ones comprising a smallest set of additional assumptions and
retractions compatible with <em>all</em> the experiments at once
(assuming, of course, the experiments are consistent, which is the
least we can require to believe them true).
</p>
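<p>
A brute-force sketch of a global minimal fix follows: it tries ever
larger sets of reactions to toggle (adding an unassumed reaction or
retracting an assumed one) until some set explains every experiment.
This is only an illustration under stated assumptions; the function
names and data layout are hypothetical, absent genes are modelled as
knocked-out reaction names, and the search is exponential in the size of
the universal set.
</p>

```python
from itertools import combinations


def producible(nutrients, reactions):
    """Forward-chain: fire every reaction whose reactants are all available."""
    available = set(nutrients)
    changed = True
    while changed:
        changed = False
        for reactants, products in reactions:
            ready = set(reactants).issubset(available)
            if ready and not set(products).issubset(available):
                available.update(products)
                changed = True
    return available


def consistent(model, experiments, universe, essential):
    """True when the model predicts the outcome of every experiment."""
    for exp in experiments:
        enabled = [universe[n] for n in model if n not in exp["knocked_out"]]
        reachable = producible(exp["nutrients"], enabled)
        if essential.issubset(reachable) != exp["growth"]:
            return False
    return True


def minimal_fixes(assumed, experiments, universe, essential):
    """All smallest sets of reactions to toggle so every experiment agrees."""
    names = sorted(universe)
    for size in range(len(names) + 1):
        fixes = []
        for toggled in combinations(names, size):
            # Toggling adds unassumed reactions and retracts assumed ones.
            model = set(assumed).symmetric_difference(toggled)
            if consistent(model, experiments, universe, essential):
                fixes.append(set(toggled))
        if fixes:
            return fixes
    return []  # the experiments contradict each other within this universe


# Hypothetical universe: two routes from A to the essential compound C.
universe = {"r1": (("A",), ("B",)), "r2": (("B",), ("C",)),
            "r3": (("A",), ("C",))}
experiments = [
    {"nutrients": {"A"}, "knocked_out": set(), "growth": True},
    {"nutrients": {"A"}, "knocked_out": {"r2"}, "growth": False},
]
print(minimal_fixes({"r1"}, experiments, universe, {"C"}))  # [{'r2'}]
```

<p>
In this toy case adding r3 would explain the growth experiment alone,
but it contradicts the no-growth experiment, so only adding r2 is
compatible with both at once.
</p>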

<h2>Experimentally</h2>

<p>
We should be able to offer suggestions for new experiments to confirm
or choose among our fixes. Examples:
</p>
<ul>
 <li>What are the minimal sets of nutrients sufficient for growth under a given fix?</li>
 <li>What genes are essential, given typical assumptions?</li>
</ul>
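<p>
Both example questions can be answered by exhaustive search on a small
model, as in the sketch below. It rests on assumptions made for
illustration: the function names and data are hypothetical, and each
gene is identified with the single reaction it enables, so an essential
gene appears as an essential reaction.
</p>

```python
from itertools import combinations


def producible(nutrients, reactions):
    """Forward-chain: fire every reaction whose reactants are all available."""
    available = set(nutrients)
    changed = True
    while changed:
        changed = False
        for reactants, products in reactions:
            ready = set(reactants).issubset(available)
            if ready and not set(products).issubset(available):
                available.update(products)
                changed = True
    return available


def grows(nutrients, reactions, essential):
    return essential.issubset(producible(nutrients, reactions.values()))


def minimal_nutrient_sets(candidates, reactions, essential):
    """Nutrient sets sufficient for growth that have no sufficient subset."""
    found = []
    for size in range(len(candidates) + 1):
        for combo in combinations(sorted(candidates), size):
            chosen = set(combo)
            if any(smaller.issubset(chosen) for smaller in found):
                continue  # a smaller sufficient set is already contained here
            if grows(chosen, reactions, essential):
                found.append(chosen)
    return found


def essential_reactions(nutrients, reactions, essential):
    """Reactions whose single knock-out prevents growth on the nutrients."""
    result = []
    for name in sorted(reactions):
        rest = {n: r for n, r in reactions.items() if n != name}
        if not grows(nutrients, rest, essential):
            result.append(name)
    return result


# Hypothetical model after some fix: C is essential, reachable from A or D.
reactions = {"r1": (("A",), ("B",)), "r2": (("B",), ("C",)),
             "r3": (("D",), ("C",))}
print(minimal_nutrient_sets({"A", "D", "X"}, reactions, {"C"}))  # [{'A'}, {'D'}]
print(essential_reactions({"A"}, reactions, {"C"}))              # ['r1', 'r2']
print(essential_reactions({"A", "D"}, reactions, {"C"}))         # []
```

<p>
Skipping supersets is sound here because adding a nutrient can never
prevent growth in this model. Each returned set, and each reaction
reported as essential, suggests an experiment whose outcome would
confirm or refute the fix.
</p>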

<h2>With Accountability</h2>

<p>
When we offer a fix, we also want to give a clear reason for why it
works (i.e. why is it sufficient? why is it necessary?).  We want to
know what exactly is the difference in the model before and after the
fix.  If we can quantify this difference, this would allow us to
choose a fix causing minimal perturbations to the model.  It's not
clear how to do this in a useful way: for example, if C is the only
missing essential compound, then an easy though unsatisfying fix is to
add C as a nutrient -- any other fix will have at least the same
consequences as this one, so a naive quantification wouldn't lead to
the best fix.
</p>
    <hr>
    <address><a href="mailto:namin@mit.edu">Nada Amin</a></address>
<!-- Created: Sat Apr 19 00:20:21 EDT 2008 -->
<!-- hhmts start -->
Last modified: Mon May  5 10:30:32 EDT 2008
<!-- hhmts end -->
  </body>
</html>
